AI and LLM Capabilities
Cyberhaven’s AI and LLM capabilities help organizations improve their incident management workflows. Our AI integrates with user-defined policies to autonomously analyze data flows, detect anomalies, and generate incidents. The LLM complements this process by generating natural-language summaries, enabling security teams to understand and respond to incidents quickly.
Linea AI Overview
Linea AI is an autonomous agent built on our data lineage platform, designed to transform incident management workflows. By prioritizing, analyzing, and summarizing incidents, Linea AI ensures critical risks are promptly detected. While traditional incident management relies on user-defined policies to generate incidents, gaps in these policies often leave significant risks undetected. Linea AI bridges this gap by autonomously detecting anomalous data flows, even without predefined policies or datasets, delivering a proactive and adaptive layer of security.
Note: Linea AI is an optional add-on feature and requires a separate license. Contact your Cyberhaven Sales Representative for licensing and purchase details.
Features
- Incident Detection and Alerts: Linea detects anomalies in the data flow based on historical events and creates incidents, even when no predefined policies or data classifications are in place. For example, it can flag risky user actions or data transfers that were not previously covered by existing policies, providing proactive protection.
- Incident Prioritization: Linea assesses incidents by evaluating the data flow and determining the severity level. For example, a user uploading their personal tax form to a personal email account is classified as low risk, while attaching sensitive documents such as source code is identified as high risk. This prioritization surfaces the most critical incidents that require immediate investigation while deprioritizing low or informational risks.
- Analyzing and Summarizing Incidents: Linea provides a detailed summary of each incident, including its root cause. The summary contains key information such as the user action that triggered the incident, the location, the type of content (based on Linea's assessment of its security risk), and the destination of the data. This rich context accelerates the investigation process by giving analysts a clear understanding of the incident at a glance.
Benefits
- Proactively Mitigate Risks: Detect and address risks that would otherwise go unnoticed.
- Identify Critical Incidents: AI-driven risk assessment helps analysts prioritize the most impactful incidents, enabling faster response.
- Reduce Incident Resolution Time: Summarized incidents provide clear root cause analysis, empowering teams to resolve issues more efficiently.
How Linea AI Works
Linea AI uses machine learning to study historical events generated by the Cyberhaven platform and detect deviations from typical data flows.
At its core is the proprietary Large Lineage Model (LLiM), which evaluates the probability of each event and flags anomalous incidents. When LLiM detects a low-probability flow, it forwards the event to a large language model (LLM) that evaluates the semantics of the data movement and compares it against historical patterns to determine severity. Linea AI is deployed privately within each customer’s secure Google Cloud Platform (GCP) environment, ensuring strict data separation—each instance only analyzes the customer’s own historical events.
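The two-stage flow described above can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the `Event` shape, the `lineage_probability` scoring, the `ANOMALY_THRESHOLD` cutoff, and the stand-in severity check are hypothetical placeholders, not Cyberhaven's actual models or API.

```python
from dataclasses import dataclass

# Hypothetical sketch of the two-stage pipeline: a lineage model scores
# each event against history, and only low-probability events reach the
# (simulated) semantic severity step.

@dataclass
class Event:
    user: str
    action: str
    destination: str

ANOMALY_THRESHOLD = 0.05  # assumed cutoff for a "low-probability" flow

def lineage_probability(event: Event, history: list[Event]) -> float:
    """Stand-in for LLiM: score how closely an event matches history."""
    matches = sum(
        1 for past in history
        if past.action == event.action and past.destination == event.destination
    )
    return matches / len(history) if history else 0.0

def assess(event: Event, history: list[Event]) -> str:
    """Route low-probability events to a simulated LLM severity check."""
    if lineage_probability(event, history) >= ANOMALY_THRESHOLD:
        return "normal"
    # In the real system the LLM judges the semantics of the data movement;
    # here we simply flag any unsanctioned destination as high severity.
    return "high" if event.destination != "corporate_email" else "low"

history = [Event("john", "email_attach", "corporate_email")] * 20
print(assess(Event("john", "paste", "telegram"), history))           # → high
print(assess(Event("john", "email_attach", "corporate_email"), history))  # → normal
```

The key design point the sketch mirrors is that the cheap probability model runs on every event, while the more expensive semantic evaluation runs only on the small fraction of events the probability model flags.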
Example: Identifying Deviations in Data Flows
The scenarios below illustrate how Linea AI distinguishes between a normal data flow and one that deviates from expected behavior.
Normal flow
- Jane, the CFO, creates a sensitive file named 2024_executive_equity_awards in a corporate Google Sheets document.
- John, the Corporate Accountant, downloads the sheet as an XLS file.
- John attaches the file to an email and sends it from his corporate email account to the HR team.
In this case, LLiM determines there is a high probability that the flow matches historical behavior—sharing sensitive files internally via sanctioned tools such as corporate email. No incident is created.
Deviated flow
- Jane, the CFO, creates a sensitive file named 2024_executive_equity_awards in a corporate Google Sheets document.
- John, the Corporate Accountant, downloads the sheet as an XLS file.
- John attaches the file to an email and sends it from his corporate email account to the HR team.
- John later opens the sheet, copies text from it, and pastes the data into a Telegram chat window.
Here, LLiM assigns a very low probability to this flow—sensitive data being shared through an unsanctioned channel such as Telegram. LLiM forwards the event to the underlying LLM, which evaluates the semantics of the action and determines whether the incident is Critical or High severity. If no existing policy covers this behavior, Linea AI autonomously creates an incident to alert the security team.
Continuous learning
By continuously analyzing metadata—including file names, locations, user roles, and historical events—LLiM refines its understanding of both expected and anomalous behaviors. This allows Linea AI to surface deviations promptly, even without predefined policies.
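One way to picture this continuous refinement is a running baseline of observed flows, updated as each new event arrives. The sketch below is a deliberately simplified assumption—the (user role, action, destination) key and the empirical-frequency scoring are illustrative, not Cyberhaven's actual learning model.

```python
from collections import Counter

# Minimal sketch of continuous learning over event metadata: keep running
# counts of (user_role, action, destination) flows and fold each new event
# into the baseline, so "expected" vs. "anomalous" shifts with history.

class FlowBaseline:
    def __init__(self) -> None:
        self.counts: Counter = Counter()
        self.total = 0

    def observe(self, user_role: str, action: str, destination: str) -> None:
        """Fold a new event into the historical baseline."""
        self.counts[(user_role, action, destination)] += 1
        self.total += 1

    def probability(self, user_role: str, action: str, destination: str) -> float:
        """Empirical probability of this flow given history so far."""
        if self.total == 0:
            return 0.0
        return self.counts[(user_role, action, destination)] / self.total

baseline = FlowBaseline()
for _ in range(50):
    baseline.observe("Accountant", "email_attach", "corporate_email")

# A never-seen flow scores zero against the learned baseline.
print(baseline.probability("Accountant", "paste", "telegram"))  # → 0.0
```

Because the baseline updates on every event, a flow that is anomalous today can become expected behavior if it is later observed repeatedly—without any policy change.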
Architecture
The architecture diagram below illustrates how LLiM and the LLM work together alongside the policy engine:
- LLiM runs independently and continuously, monitoring events for anomalies.
- When LLiM detects an anomalous event, it forwards the event to the LLM.
- The LLM compares the anomalous event against historical data, determines whether it meets the Critical or High severity threshold, and generates an incident with a natural-language summary when warranted.
- For incidents triggered by user-defined policies, the policy engine follows the same process—forwarding events to the LLM for risk assessment and summary generation.
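The routing in the steps above can be sketched in a few lines. All names here (`llm_assess`, `handle_event`, the severity set) are hypothetical stand-ins for illustration, not a real Cyberhaven interface.

```python
# Sketch of the architecture's routing: both LLiM-flagged anomalies and
# policy-engine matches flow into the same LLM step for severity and a
# summary; anomaly-only events become incidents only at Critical/High.

CREATE_INCIDENT_AT = {"critical", "high"}

def llm_assess(event: dict) -> str:
    """Placeholder for the LLM's semantic severity judgment."""
    return "critical" if event.get("destination") == "telegram" else "low"

def handle_event(event: dict, is_anomalous: bool, policy_matched: bool):
    """Create an incident when warranted, with a natural-language summary."""
    if not (is_anomalous or policy_matched):
        return None
    severity = llm_assess(event)
    if not policy_matched and severity not in CREATE_INCIDENT_AT:
        return None  # low-probability but benign: no incident
    return {
        "severity": severity,
        "summary": f"{event['user']} sent data to {event['destination']}",
    }

incident = handle_event(
    {"user": "john", "destination": "telegram"},
    is_anomalous=True,
    policy_matched=False,
)
print(incident["severity"])  # → critical
```

Note the asymmetry mirrored from the text: policy-triggered events always yield an incident (the LLM just adds risk assessment and a summary), while anomaly-only events must clear the severity threshold.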
Linea AI operates entirely within each customer’s GCP environment. Customer data is never used to train external models or models for other customers, and the LLM observes a strict zero-data-retention policy.
